{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img align=\"right\" style=\"max-width: 200px; height: auto\" src=\"cfds_logo.png\">\n",
    "\n",
    "###  Lab 09 - \"Deep Learning - Convolutional Neural Networks\"\n",
    "\n",
    "Chartered Financial Data Scientist (CFDS), Spring Term 2020"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the last lab you learned about how to utilize a **supervised** (deep) machine learning technique namely **Artificial Neural Networks (ANNs)** to classify tiny images of handwritten digits contained in the MNIST dataset. \n",
    "\n",
    "In this lab, we will learn how to enhance ANNs using PyTorch to classify even more complex images. Therefore, we use a special type of deep neural network referred to **Convolutional Neural Networks (CNNs)**. CNNs encompass the ability to take advantage of the hierarchical pattern in data and assemble more complex patterns using smaller and simpler patterns. Therefore, CNNs are capable to learn a set of discriminative features 'pattern' and subsequently utilize the learned pattern to classify the content of an image.\n",
    "\n",
    "We will again use the functionality of the `PyTorch` library to implement and train an CNN based neural network. The network will be trained on a set of tiny images to learn a model of the image content. Upon successful training, we will utilize the learned CNN model to classify so far unseen tiny images into distinct categories such as aeroplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks. \n",
    "\n",
    "The figure below illustrates a high-level view on the machine learning process we aim to establish in this lab."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img align=\"center\" style=\"max-width: 900px\" src=\"classification.png\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(Image of the CNN architecture created via http://alexlenail.me/)\n",
    "\n",
    "As always, pls. don't hesitate to ask all your questions either during the lab or send us an email (using our\n",
    "fds.ai email addresses)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Lab Objectives:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After today's lab, you should be able to:\n",
    "\n",
    "> 1. Understand the basic concepts, intuitions and major building blocks of **Convolutional Neural Networks (CNNs)**.\n",
    "> 2. Know how to **implement and to train a CNN** to learn a model of tiny image data.\n",
    "> 3. Understand how to apply such a learned model to **classify images** images based on their content into distinct categories.\n",
    "> 4. Know how to **interpret and visualize** the model's classification results."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Setup of the Jupyter Notebook Environment"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Similar to the previous labs, we need to import a couple of Python libraries that allow for data analysis and data visualization. We will mostly use the `PyTorch`, `Numpy`, `Sklearn`, `Matplotlib`, `Seaborn` and a few utility libraries throughout this lab:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# import standard python libraries\n",
    "import os, urllib, io\n",
    "from datetime import datetime\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Import Python machine / deep learning libraries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# import the PyTorch deep learning library\n",
    "import torch, torchvision\n",
    "import torch.nn.functional as F\n",
    "from torch import nn, optim\n",
    "from torch.autograd import Variable"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Import the sklearn classification metrics:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# import sklearn classification evaluation library\n",
    "from sklearn import metrics\n",
    "from sklearn.metrics import classification_report, confusion_matrix"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Import Python plotting libraries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# import matplotlib, seaborn, and PIL data visualization libary\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "from PIL import Image"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Enable notebook matplotlib inline plotting:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create notebook folder structure to store the data as well as the trained neural network models:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if not os.path.exists('./data'): os.makedirs('./data')  # create data directory\n",
    "if not os.path.exists('./models'): os.makedirs('./models')  # create trained models directory"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Set a random `seed` value to obtain reproducable results:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# init deterministic seed\n",
    "seed_value = 1234\n",
    "np.random.seed(seed_value) # set numpy seed\n",
    "torch.manual_seed(seed_value) # set pytorch seed CPU"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Enable GPU computing by setting the `device` flag and init a `CUDA` seed:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# set cpu or gpu enabled device\n",
    "device = torch.device('cuda' if torch.cuda.is_available() else 'cpu').type\n",
    "\n",
    "# init deterministic GPU seed\n",
    "torch.cuda.manual_seed(seed_value)\n",
    "\n",
    "# log type of device enabled\n",
    "print('[LOG] notebook with {} computation enabled'.format(str(device)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's determine if we have access to a GPU provided by e.g. Google's COLab environment:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!nvidia-smi"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1. Dataset Download and Data Assessment"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The **CIFAR-10 database** (**C**anadian **I**nstitute **F**or **A**dvanced **R**esearch) is a collection of images that are commonly used to train machine learning and computer vision algorithms. The database is widely used to conduct computer vision research using machine learning and deep learning methods:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img align=\"center\" style=\"max-width: 500px; height: 500px\" src=\"cifar10.png\">\n",
    "\n",
    "(Source: https://www.kaggle.com/c/cifar-10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Further details on the dataset can be obtained via: *Krizhevsky, A., 2009. \"Learning Multiple Layers of Features from Tiny Images\",  \n",
    "( https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf ).\"*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The CIFAR-10 database contains **60,000 color images** (50,000 training images and 10,000 validation images). The size of each image is 32 by 32 pixels. The collection of images encompasses 10 different classes that represent airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks. Let's define the distinct classs for further analytics:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cifar10_classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Thereby the dataset contains 6,000 images for each of the ten classes. The CIFAR-10 is a straightforward dataset that can be used to teach a computer how to recognize objects in images.\n",
    "\n",
    "Let's download, transform and inspect the training images of the dataset. Therefore, we first will define the directory we aim to store the training data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "train_path = './data/train_cifar10'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, let's download the training data accordingly:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# define pytorch transformation into tensor format\n",
    "transf = torchvision.transforms.Compose([torchvision.transforms.ToTensor(), torchvision.transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])\n",
    "\n",
    "# download and transform training images\n",
    "cifar10_train_data = torchvision.datasets.CIFAR10(root=train_path, train=True, transform=transf, download=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Verify the volume of training images downloaded:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# get the length of the training data\n",
    "len(cifar10_train_data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Furthermore, let's investigate a couple of the training images:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# set (random) image id\n",
    "image_id = 1800\n",
    "\n",
    "# retrieve image exhibiting the image id\n",
    "cifar10_train_data[image_id]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cifar10_train_image.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ok, that doesn't seem easily interpretable ;) Let's first seperate the image from its label information:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cifar10_train_image, cifar10_train_label = cifar10_train_data[image_id]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Great, now we are able to visually inspect our sample image: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# define tensor to image transformation\n",
    "trans = torchvision.transforms.ToPILImage()\n",
    "\n",
    "# set image plot title \n",
    "plt.title('Example: {}, Label: \"{}\"'.format(str(image_id), str(cifar10_classes[cifar10_train_label])))\n",
    "\n",
    "# un-normalize cifar 10 image sample\n",
    "cifar10_train_image_plot = cifar10_train_image / 2.0 + 0.5\n",
    "\n",
    "# plot 10 image sample\n",
    "plt.imshow(trans(cifar10_train_image_plot))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Fantastic, right? Let's now decide on where we want to store the evaluation data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "eval_path = './data/eval_cifar10'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And download the evaluation data accordingly:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# define pytorch transformation into tensor format\n",
    "transf = torchvision.transforms.Compose([torchvision.transforms.ToTensor(), torchvision.transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])\n",
    "\n",
    "# download and transform validation images\n",
    "cifar10_eval_data = torchvision.datasets.CIFAR10(root=eval_path, train=False, transform=transf, download=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Verify the volume of validation images downloaded:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# get the length of the training data\n",
    "len(cifar10_eval_data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2. Neural Network Implementation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this section we, will implement the architecture of the **neural network** we aim to utilize to learn a model that is capable of classifying the 32x32 pixel CIFAR 10 images according to the objects contained in each image. However, before we start the implementation, let's briefly revisit the process to be established. The following cartoon provides a birds-eye view:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img align=\"center\" style=\"max-width: 900px\" src=\"process.png\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our CNN, which we name 'CIFAR10Net' and aim to implement consists of two **convolutional layers** and three **fully-connected layers**. In general, convolutional layers are specifically designed to learn a set of **high-level features** (\"patterns\") in the processed images, e.g., tiny edges and shapes. The fully-connected layers utilize the learned features to learn **non-linear feature combinations** that allow for highly accurate classification of the image content into the different image classes of the CIFAR-10 dataset, such as, birds, aeroplanes, horses."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's implement the network architecture and subsequently have a more in-depth look into its architectural details:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# implement the CIFAR10Net network architecture\n",
    "class CIFAR10Net(nn.Module):\n",
    "    \n",
    "    # define the class constructor\n",
    "    def __init__(self):\n",
    "        \n",
    "        # call super class constructor\n",
    "        super(CIFAR10Net, self).__init__()\n",
    "        \n",
    "        # specify convolution layer 1\n",
    "        self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5, stride=1, padding=0)\n",
    "        \n",
    "        # define max-pooling layer 1\n",
    "        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)\n",
    "        \n",
    "        # specify convolution layer 2\n",
    "        self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5, stride=1, padding=0)\n",
    "        \n",
    "        # define max-pooling layer 2\n",
    "        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)\n",
    "        \n",
    "        # specify fc layer 1 - in 16 * 5 * 5, out 120\n",
    "        self.linear1 = nn.Linear(16 * 5 * 5, 120, bias=True) # the linearity W*x+b\n",
    "        self.relu1 = nn.ReLU(inplace=True) # the non-linearity\n",
    "        \n",
    "        # specify fc layer 2 - in 120, out 84\n",
    "        self.linear2 = nn.Linear(120, 84, bias=True) # the linearity W*x+b\n",
    "        self.relu2 = nn.ReLU(inplace=True) # the non-linarity\n",
    "        \n",
    "        # specify fc layer 3 - in 84, out 10\n",
    "        self.linear3 = nn.Linear(84, 10) # the linearity W*x+b\n",
    "        \n",
    "        # add a softmax to the last layer\n",
    "        self.logsoftmax = nn.LogSoftmax(dim=1) # the softmax\n",
    "        \n",
    "    # define network forward pass\n",
    "    def forward(self, images):\n",
    "        \n",
    "        # high-level feature learning via convolutional layers\n",
    "        \n",
    "        # define conv layer 1 forward pass\n",
    "        x = self.pool1(self.relu1(self.conv1(images)))\n",
    "        \n",
    "        # define conv layer 2 forward pass\n",
    "        x = self.pool2(self.relu2(self.conv2(x)))\n",
    "        \n",
    "        # feature flattening\n",
    "        \n",
    "        # reshape image pixels\n",
    "        x = x.view(-1, 16 * 5 * 5)\n",
    "        \n",
    "        # combination of feature learning via non-linear layers\n",
    "        \n",
    "        # define fc layer 1 forward pass\n",
    "        x = self.relu1(self.linear1(x))\n",
    "        \n",
    "        # define fc layer 2 forward pass\n",
    "        x = self.relu2(self.linear2(x))\n",
    "        \n",
    "        # define layer 3 forward pass\n",
    "        x = self.logsoftmax(self.linear3(x))\n",
    "        \n",
    "        # return forward pass result\n",
    "        return x"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You may have noticed that we applied two more layers (compared to the MNIST example described in the last lab) before the fully-connected layers. These layers are referred to as **convolutional** layers and are usually comprised of three operations, (1) **convolution**, (2) **non-linearity**, and (3) **max-pooling**. Those operations are usually executed in sequential order during the forward pass through a convolutional layer."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the following, we will have a detailed look into the functionality and number of parameters in each layer. We will start with providing images of 3x32x32 dimensions to the network, i.e., the three channels (red, green, blue) of an image each of size 32x32 pixels."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 2.1. High-Level Feature Learning by Convolutional Layers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's first have a look into the convolutional layers of the network as illustrated in the following:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img align=\"center\" style=\"max-width: 600px\" src=\"convolutions.png\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**First Convolutional Layer**: The first convolutional layer expects three input channels and will convolve six filters each of size 3x5x5. Let's briefly revisit how we can perform a convolutional operation on a given image. For that, we need to define a kernel which is a matrix of size 5x5, for example. To perform the convolution operation, we slide the kernel along with the image horizontally and vertically and obtain the dot product of the kernel and the pixel values of the image inside the kernel ('receptive field' of the kernel)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following illustration shows an example of a discrete convolution:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img align=\"center\" style=\"max-width: 800px\" src=\"convsample.png\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The left grid is called the input (an image or feature map). The middle grid, referred to as kernel, slides across the input feature map (or image). At each location, the product between each element of the kernel and the input element it overlaps is computed, and the results are summed up to obtain the output in the current location. In general, a discrete convolution is mathematically expressed by:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<center> $y(m, n) = x(m, n) * h(m, n) = \\sum^{m}_{j=0} \\sum^{n}_{i=0} x(i, j) * h(m-i, n-j)$, </center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "where $x$ denotes the input image or feature map, $h$ the applied kernel, and, $y$ the output."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When performing the convolution operation the 'stride' defines the number of pixels to pass at a time when sliding the kernel over the input. While 'padding' adds the number of pixels to the input image (or feature map) to ensure that the output has the same shape as the input. Let's have a look at another animated example:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img align=\"center\" style=\"max-width: 800px\" src=\"convsample_animated.gif\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(Source: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53)\n",
    "\n",
    "In our implementation padding is set to 0 and stride is set to 1. As a result, the output size of the convolutional layer becomes 6x28x28, because (32 - 5) + 1 = 28. This layer exhibits ((5 x 5 x 3) + 1) x 6 = 456 parameter. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**First Max-Pooling Layer:** The max-pooling process is a sample-based discretization operation. The objective is to down-sample an input representation (image, hidden-layer output matrix, etc.), reducing its dimensionality and allowing for assumptions to be made about features contained in the sub-regions binned.\n",
    "\n",
    "To conduct such an operation, we again need to define a kernel. Max-pooling kernels are usually a tiny matrix of, e.g, of size 2x2. To perform the max-pooling operation, we slide the kernel along the image horizontally and vertically (similarly to a convolution) and compute the maximum pixel value of the image (or feature map) inside the kernel (the receptive field of the kernel)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following illustration shows an example of a max-pooling operation:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img align=\"center\" style=\"max-width: 500px\" src=\"poolsample.png\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The left grid is called the input (an image or feature map). The middle grid, referred to as kernel, slides across the input feature map (or image). We use a stride of 2, meaning the step distance for stepping over our input will be 2 pixels and won't overlap regions. At each location, the max value of the region that overlaps with the elements of the kernel and the input elements it overlaps is computed, and the results are obtained in the output of the current location."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In our implementation, we do max-pooling with a 2x2 kernel and stride 2 this effectively drops the original image size from 6x28x28 to 6x14x14. Let's have a look at an exemplary visualization of 64 features learnt in the first convolutional layer on the CIFAR- 10 dataset."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img align=\"center\" style=\"max-width: 600px\" src=\"cnnfeatures.png\">\n",
    "\n",
    "(Source: Yu, Dingjun, Hanli Wang, Peiqiu Chen, and Zhihua Wei. **\"Mixed pooling for convolutional neural networks.\"** In International conference on rough sets and knowledge technology, pp. 364-375. Springer, Cham, 2014)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Second Convolutional Layer:** The second convolutional layer expects 6 input channels and will convolve 16 filters each of size 6x5x5x. Since padding is set to 0 and stride is set 1, the output size is 16x10x10, because (14  - 5) + 1 = 10. This layer therefore has ((5 x 5 x 6) + 1 x 16) = 24,16 parameter.\n",
    "\n",
    "**Second Max-Pooling Layer:** The second down-sampling layer uses max-pooling with 2x2 kernel and stride set to 2. This effectively drops the size from 16x10x10 to 16x5x5. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 2.2. Feature Flattening\n",
    "\n",
    "The output of the final-max pooling layer needs to be flattened so that we can connect it to a fully connected layer. This is achieved using the `torch.Tensor.view` method. Setting the parameter of the method to `-1` will automatically infer the number of rows required to handle the mini-batch size of the data. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 2.3. Learning of Feature Combinations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's now have a look into the non-linear layers of the network illustrated in the following:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img align=\"center\" style=\"max-width: 600px\" src=\"fullyconnected.png\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The first fully connected layer uses 'Rectified Linear Units' (ReLU) activation functions to learn potential nonlinear combinations of features. The layers are implemented similarly to the fifth lab. Therefore, we will only focus on the number of parameters of each fully-connected layer:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**First Fully-Connected Layer:** The first fully-connected layer consists of 120 neurons, thus in total exhibits ((16 x 5 x 5) + 1) x 120 = 48,120 parameter. \n",
    "\n",
    "**Second Fully-Connected Layer:** The output of the first fully-connected layer is then transferred to second fully-connected layer. The layer consists of 84 neurons equipped with ReLu activation functions, this in total exhibits (120 + 1) x 84 = 10,164 parameter."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The output of the second fully-connected layer is then transferred to the output-layer (third fully-connected layer). The output layer is equipped with a softmax (that you learned about in the previous lab 05) and is made up of ten neurons, one for each object class contained in the CIFAR-10 dataset. This layer exhibits (84 + 1) x 10 = 850 parameter.\n",
    "\n",
    "\n",
    "As a result our CIFAR-10 convolutional neural exhibits a total of 456 + 2,416 + 48,120 + 10,164 + 850 = 62,006 parameter.\n",
    "\n",
    "(Source: https://www.stefanfiott.com/machine-learning/cifar-10-classifier-using-cnn-in-pytorch/)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, that we have implemented our first Convolutional Neural Network we are ready to instantiate a network model to be trained:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = CIFAR10Net()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's push the initialized `CIFAR10Net` model to the computing `device` that is enabled:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = model.to(device)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's double check if our model was deployed to the GPU if available:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!nvidia-smi"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once the model is initialized we can visualize the model structure and review the implemented network architecture by execution of the following cell:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# print the initialized architectures\n",
    "print('[LOG] CIFAR10Net architecture:\\n\\n{}\\n'.format(model))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Looks like intended? Brilliant! Finally, let's have a look into the number of model parameters that we aim to train in the next steps of the notebook:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# init the number of model parameters\n",
    "num_params = 0\n",
    "\n",
    "# iterate over the distinct parameters\n",
    "for param in model.parameters():\n",
    "\n",
    "    # collect number of parameters\n",
    "    num_params += param.numel()\n",
    "    \n",
    "# print the number of model paramters\n",
    "print('[LOG] Number of to be trained CIFAR10Net model parameters: {}.'.format(num_params))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ok, our \"simple\" CIFAR10Net model already encompasses an impressive number 62'006 model parameters to be trained."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we have implemented the CIFAR10Net, we are ready to train the network. However, before starting the training, we need to define an appropriate loss function. Remember, we aim to train our model to learn a set of model parameters $\\theta$ that minimize the classification error of the true class $c^{i}$ of a given CIFAR-10 image $x^{i}$ and its predicted class $\\hat{c}^{i} = f_\\theta(x^{i})$ as faithfully as possible. \n",
    "\n",
    "In this lab we use (similarly to lab 05) the **'Negative Log-Likelihood (NLL)'** loss. During training the NLL loss will penalize models that result in a high classification error between the predicted class labels $\\hat{c}^{i}$ and their respective true class label $c^{i}$. Now that we have implemented the CIFAR10Net, we are ready to train the network. Before starting the training, we need to define an appropriate loss function. Remember, we aim to train our model to learn a set of model parameters $\\theta$ that minimize the classification error of the true class $c^{i}$ of a given CIFAR-10 image $x^{i}$ and its predicted class $\\hat{c}^{i} = f_\\theta(x^{i})$ as faithfully as possible. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's instantiate the NLL via the execution of the following PyTorch command:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# define the optimization criterion / loss function\n",
    "nll_loss = nn.NLLLoss()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's also push the initialized `nll_loss` computation to the computing `device` that is enabled:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "nll_loss = nll_loss.to(device)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Based on the loss magnitude of a certain mini-batch PyTorch automatically computes the gradients. But even better, based on the gradient, the library also helps us in the optimization and update of the network parameters $\\theta$.\n",
    "\n",
    "We will use the **Stochastic Gradient Descent (SGD) optimization** and set the `learning-rate to 0.001`. Each mini-batch step the optimizer will update the model parameters $\\theta$ values according to the degree of classification error (the NLL loss)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# define learning rate and optimization strategy\n",
    "learning_rate = 0.001\n",
    "optimizer = optim.SGD(params=model.parameters(), lr=learning_rate)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we have successfully implemented and defined the three CNN building blocks let's take some time to review the `CIFAR10Net` model definition as well as the `loss`. Please, read the above code and comments carefully and don't hesitate to let us know any questions you might have."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3. Training the Neural Network Model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this section, we will train our neural network model (as implemented in the section above) using the transformed images. More specifically, we will have a detailed look into the distinct training steps as well as how to monitor the training progress."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 3.1. Preparing the Network Training"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So far, we have pre-processed the dataset, implemented the CNN and defined the classification error. Let's now start to train a corresponding model for **20 epochs** and a **mini-batch size of 128** CIFAR-10 images per batch. This implies that the whole dataset will be fed to the CNN 20 times in chunks of 4 images yielding to **12,500 mini-batches** (50.000 training images / 4 images per mini-batch) per epoch. After the processing of each mini-batch, the parameters of the network will be updated. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# specify the training parameters\n",
    "num_epochs = 20 # number of training epochs\n",
    "mini_batch_size = 4 # size of the mini-batches"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Furthermore, lets specifiy and instantiate a corresponding PyTorch data loader that feeds the image tensors to our neural network:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cifar10_train_dataloader = torch.utils.data.DataLoader(cifar10_train_data, batch_size=mini_batch_size, shuffle=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 3.2. Running the Network Training"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we start training the model. The training procedure for each mini-batch is performed as follows: \n",
    "\n",
    ">1. do a forward pass through the CIFAR10Net network, \n",
    ">2. compute the negative log-likelihood classification error $\\mathcal{L}^{NLL}_{\\theta}(c^{i};\\hat{c}^{i})$, \n",
    ">3. do a backward pass through the CIFAR10Net network, and \n",
    ">4. update the parameters of the network $f_\\theta(\\cdot)$.\n",
    "\n",
    "To ensure learning while training our CNN model, we will monitor whether the loss decreases with progressing training. Therefore, we obtain and evaluate the classification performance of the entire training dataset after each training epoch. Based on this evaluation, we can conclude on the training progress and whether the loss is converging (indicating that the model might not improve any further).\n",
    "\n",
    "The following elements of the network training code below should be given particular attention:\n",
    " \n",
    ">- `loss.backward()` computes the gradients based on the magnitude of the reconstruction loss,\n",
    ">- `optimizer.step()` updates the network parameters based on the gradient."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# init collection of training epoch losses\n",
    "train_epoch_losses = []\n",
    "\n",
    "# set the model in training mode\n",
    "model.train()\n",
    "\n",
    "# train the CIFAR10 model\n",
    "for epoch in range(num_epochs):\n",
    "    \n",
    "    # init collection of mini-batch losses\n",
    "    train_mini_batch_losses = []\n",
    "    \n",
    "    # iterate over all-mini batches\n",
    "    for i, (images, labels) in enumerate(cifar10_train_dataloader):\n",
    "        \n",
    "        # push mini-batch data to computation device\n",
    "        images = images.to(device)\n",
    "        labels = labels.to(device)\n",
    "\n",
    "        # run forward pass through the network\n",
    "        output = model(images)\n",
    "        \n",
    "        # reset graph gradients\n",
    "        model.zero_grad()\n",
    "        \n",
    "        # determine classification loss\n",
    "        loss = nll_loss(output, labels)\n",
    "        \n",
    "        # run backward pass\n",
    "        loss.backward()\n",
    "        \n",
    "        # update network paramaters\n",
    "        optimizer.step()\n",
    "        \n",
    "        # collect mini-batch reconstruction loss\n",
    "        train_mini_batch_losses.append(loss.data.item())\n",
    "\n",
    "    # determine mean min-batch loss of epoch\n",
    "    train_epoch_loss = np.mean(train_mini_batch_losses)\n",
    "    \n",
    "    # print epoch loss\n",
    "    now = datetime.utcnow().strftime(\"%Y%m%d-%H:%M:%S\")\n",
    "    print('[LOG {}] epoch: {} train-loss: {}'.format(str(now), str(epoch), str(train_epoch_loss)))\n",
    "    \n",
    "    # save model to local directory\n",
    "    model_name = 'cifar10_model_epoch_{}.pth'.format(str(epoch))\n",
    "    torch.save(model.state_dict(), os.path.join(\"./models\", model_name))\n",
    "    \n",
    "    # determine mean min-batch loss of epoch\n",
    "    train_epoch_losses.append(train_epoch_loss)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Upon successfull training let's visualize and inspect the training loss per epoch:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# prepare plot\n",
    "fig = plt.figure()\n",
    "ax = fig.add_subplot(111)\n",
    "\n",
    "# add grid\n",
    "ax.grid(linestyle='dotted')\n",
    "\n",
    "# plot the training epochs vs. the epochs' classification error\n",
    "ax.plot(np.array(range(1, len(train_epoch_losses)+1)), train_epoch_losses, label='epoch loss (blue)')\n",
    "\n",
    "# add axis legends\n",
    "ax.set_xlabel(\"[training epoch $e_i$]\", fontsize=10)\n",
    "ax.set_ylabel(\"[Classification Error $\\mathcal{L}^{NLL}$]\", fontsize=10)\n",
    "\n",
    "# set plot legend\n",
    "plt.legend(loc=\"upper right\", numpoints=1, fancybox=True)\n",
    "\n",
    "# add plot title\n",
    "plt.title('Training Epochs $e_i$ vs. Classification Error $L^{NLL}$', fontsize=10);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ok, fantastic. The training error converges nicely. We could definitely train the network a couple more epochs until the error converges. But let's stay with the 20 training epochs for now and continue with evaluating our trained model."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4. Evaluation of the Trained Neural Network Model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Prior to evaluating our model, let's load the best performing model. Remember, that we stored a snapshot of the model after each training epoch to our local model directory. We will now load the last snapshot saved."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# restore pre-trained model snapshot\n",
    "best_model_name = \"cifar10_model_epoch_19.pth\"\n",
    "\n",
    "# init pre-trained model class\n",
    "best_model = CIFAR10Net()\n",
    "\n",
    "# load pre-trained models\n",
    "best_model.load_state_dict(torch.load(os.path.join(\"models\", best_model_name), map_location=torch.device('cpu')))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's inspect if the model was loaded successfully: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# set model in evaluation mode\n",
    "best_model.eval()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In order to evaluate our trained model, we need to feed the CIFAR10 images reserved for evaluation (the images that we didn't use as part of the training process) through the model. Therefore, let's again define a corresponding PyTorch data loader that feeds the image tensors to our neural network: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cifar10_eval_dataloader = torch.utils.data.DataLoader(cifar10_eval_data, batch_size=10000, shuffle=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will now evaluate the trained model using the same mini-batch approach as we did when training the network and derive the mean negative log-likelihood loss of all mini-batches processed in an epoch:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# init collection of mini-batch losses\n",
    "eval_mini_batch_losses = []\n",
    "\n",
    "# iterate over all-mini batches\n",
    "for i, (images, labels) in enumerate(cifar10_eval_dataloader):\n",
    "\n",
    "    # run forward pass through the network\n",
    "    output = best_model(images)\n",
    "\n",
    "    # determine classification loss\n",
    "    loss = nll_loss(output, labels)\n",
    "\n",
    "    # collect mini-batch reconstruction loss\n",
    "    eval_mini_batch_losses.append(loss.data.item())\n",
    "\n",
    "# determine mean min-batch loss of epoch\n",
    "eval_loss = np.mean(eval_mini_batch_losses)\n",
    "\n",
    "# print epoch loss\n",
    "now = datetime.utcnow().strftime(\"%Y%m%d-%H:%M:%S\")\n",
    "print('[LOG {}] eval-loss: {}'.format(str(now), str(eval_loss)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ok, great. The evaluation loss looks in-line with our training loss. Let's now inspect a few sample predictions to get an impression of the model quality. Therefore, we will again pick a random image of our evaluation dataset and retrieve its PyTorch tensor as well as the corresponding label:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# set (random) image id\n",
    "image_id = 777\n",
    "\n",
    "# retrieve image exhibiting the image id\n",
    "cifar10_eval_image, cifar10_eval_label = cifar10_eval_data[image_id]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's now inspect the true class of the image we selected:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cifar10_classes[cifar10_eval_label]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ok, the randomly selected image should contain a two (2). Let's inspect the image accordingly:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# define tensor to image transformation\n",
    "trans = torchvision.transforms.ToPILImage()\n",
    "\n",
    "# set image plot title \n",
    "plt.title('Example: {}, Label: {}'.format(str(image_id), str(cifar10_classes[cifar10_eval_label])))\n",
    "\n",
    "# un-normalize cifar 10 image sample\n",
    "cifar10_eval_image_plot = cifar10_eval_image / 2.0 + 0.5\n",
    "\n",
    "# plot cifar 10 image sample\n",
    "plt.imshow(trans(cifar10_eval_image_plot))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ok, let's compare the true label with the prediction of our model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "best_model(cifar10_eval_image.unsqueeze(0))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can even determine the likelihood of the most probable class:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cifar10_classes[torch.argmax(best_model(Variable(cifar10_eval_image.unsqueeze(0))), dim=1).item()]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's now obtain the predictions for all the CIFAR-10 images of the evaluation data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "predictions = torch.argmax(best_model(iter(cifar10_eval_dataloader).next()[0]), dim=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Furthermore, let's obtain the overall classification accuracy:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "metrics.accuracy_score(cifar10_eval_data.targets, predictions.detach())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's also inspect the confusion matrix of the model predictions to determine major sources of misclassification:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# determine classification matrix of the predicted and target classes\n",
    "mat = confusion_matrix(cifar10_eval_data.targets, predictions.detach())\n",
    "\n",
    "# plot corresponding confusion matrix\n",
    "sns.heatmap(mat.T, square=True, annot=True, fmt='d', cbar=False, cmap='YlOrRd_r', xticklabels=cifar10_classes, yticklabels=cifar10_classes)\n",
    "plt.title('CIFAR-10 classification matrix')\n",
    "plt.xlabel('[true label]')\n",
    "plt.ylabel('[predicted label]');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ok, we can easily see that our current model confuses images of cats and dogs as well as images of trucks and cars quite often. This is again not surprising since those image categories exhibit a high semantic and therefore visual similarity."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercises:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We recommend you try the following exercises as part of the lab:\n",
    "\n",
    "**1. Train the network a couple more epochs and evaluate its prediction accuracy.**\n",
    "\n",
    "> Increase the number of training epochs up to 50 epochs and re-run the network training. Load and evaluate the model exhibiting the lowest training loss. What kind of behavior in terms of prediction accuracy can be observed with increasing the training epochs?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# ***************************************************\n",
    "# INSERT YOUR CODE HERE\n",
    "# ***************************************************"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**2. Evaluation of \"shallow\" vs. \"deep\" neural network architectures.**\n",
    "\n",
    "> In addition to the architecture of the lab notebook, evaluate further (more shallow as well as more deep) neural network architectures by (1) either removing or adding layers to the network and/or (2) increasing/decreasing the number of neurons per layer. Train a model (using the architectures you selected) for at least 50 training epochs. Analyze the prediction performance of the trained models in terms of training time and prediction accuracy. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# ***************************************************\n",
    "# INSERT YOUR CODE HERE\n",
    "# ***************************************************"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Lab Summary:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this lab, a step by step introduction into **design, implementation, training and evaluation** of convolutional neural networks CNNs to classify tiny images of objects is presented. The code and exercises presented in this lab may serves as a starting point for developing more complex, deeper and more tailored CNNs."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You may want to execute the content of your lab outside of the Jupyter notebook environment, e.g. on a compute node or a server. The cell below converts the lab notebook into a standalone and executable python script. Pls. note that to convert the notebook, you need to install Python's **nbconvert** library and its extensions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# installing the nbconvert library\n",
    "!pip3 install nbconvert\n",
    "!pip3 install jupyter_contrib_nbextensions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's now convert the Jupyter notebook into a plain Python script:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!jupyter nbconvert --to script cfds_lab_09.ipynb"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {
    "height": "calc(100% - 180px)",
    "left": "10px",
    "top": "150px",
    "width": "254.39999389648438px"
   },
   "toc_section_display": true,
   "toc_window_display": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
